Depth from Refraction Using a Transparent Medium with Unknown Pose and Refractive Index
RIGID: Recurrent GAN Inversion and Editing of Real Face Videos
GAN inversion is indispensable for applying the powerful editability of GANs
to real images. However, existing methods invert video frames individually,
often leading to undesired temporally inconsistent results. In this paper, we
propose a unified recurrent framework, named \textbf{R}ecurrent v\textbf{I}deo
\textbf{G}AN \textbf{I}nversion and e\textbf{D}iting (RIGID), to explicitly and
simultaneously enforce temporally coherent GAN inversion and facial editing of
real videos. Our approach models the temporal relations between current and
previous frames from three aspects. To enable a faithful real video
reconstruction, we first maximize the inversion fidelity and consistency by
learning a temporally compensated latent code. Second, we observe that
incoherent noise lies in the high-frequency domain and can be disentangled
from the latent space. Third, to remove the inconsistency after attribute
manipulation,
we propose an \textit{in-between frame composition constraint} such that any
frame must be a direct composite of its neighboring frames. Our
unified framework learns the inherent coherence between input frames in an
end-to-end manner, and therefore it is agnostic to a specific attribute and can
be applied to arbitrary editing of the same video without re-training.
Extensive experiments demonstrate that RIGID outperforms state-of-the-art
methods qualitatively and quantitatively in both inversion and editing tasks.
Code and results can be found at \url{https://cnnlstm.github.io/RIGID}.
Comment: ICCV 2023
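The in-between frame composition constraint described above can be illustrated with a minimal sketch: penalize any frame that deviates from a composite of its temporal neighbors. This is a toy illustration, not the paper's actual formulation; the fixed linear blend weight `alpha` and the plain MSE penalty are assumptions made here for brevity:

```python
import numpy as np

def composition_loss(prev_frame, cur_frame, next_frame, alpha=0.5):
    """Toy in-between composition penalty: the current frame should be
    expressible as a blend of its temporal neighbours (illustrative only)."""
    composite = alpha * prev_frame + (1.0 - alpha) * next_frame
    return float(np.mean((cur_frame - composite) ** 2))
```

A frame that already lies on the blend of its neighbors incurs zero penalty, while temporally incoherent frames are pushed back toward the composite.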
Semi-supervised Cycle-GAN for face photo-sketch translation in the wild
The performance of face photo-sketch translation has improved substantially
thanks to deep neural networks. GAN-based methods trained on paired images can produce
high-quality results under laboratory settings. Such paired datasets are,
however, often very small and lack diversity. Meanwhile, Cycle-GANs trained
with unpaired photo-sketch datasets suffer from the \emph{steganography}
phenomenon, which makes them ineffective on face photos in the wild. In this
paper, we introduce a semi-supervised approach with a noise-injection strategy,
named Semi-Cycle-GAN (SCG), to tackle these problems. For the first problem, we
propose a {\em pseudo sketch feature} representation for each input photo
composed from a small reference set of photo-sketch pairs, and use the
resulting {\em pseudo pairs} to supervise a photo-to-sketch generator. The
outputs of this generator can in turn help to train a sketch-to-photo generator
in a self-supervised manner. This allows us to train both generators using a
small reference set of photo-sketch pairs
together with a large face photo dataset (without ground-truth sketches). For
the second problem, we show that the simple noise-injection strategy works well
to alleviate the \emph{steganography} effect in SCG and helps to produce more
reasonable sketch-to-photo results with less overfitting than fully supervised
approaches. Experiments show that SCG achieves competitive performance on
public benchmarks and superior results on photos in the wild.
Comment: 11 pages, 11 figures, 5 tables (+ 7-page appendix)
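The pseudo sketch feature idea can be sketched as a nearest-neighbour lookup: each feature of the input photo borrows the sketch feature of its closest reference photo feature, yielding a supervision signal without a ground-truth sketch. The toy vectors, the squared-Euclidean metric, and the function name below are assumptions for illustration, not the paper's implementation:

```python
import numpy as np

def pseudo_sketch_features(photo_feats, ref_photo_feats, ref_sketch_feats):
    """For every photo feature, find the nearest reference photo feature
    and borrow the sketch feature of that reference pair (illustrative)."""
    # Pairwise squared distances via broadcasting: (N, M)
    d2 = ((photo_feats[:, None, :] - ref_photo_feats[None, :, :]) ** 2).sum(-1)
    nearest = d2.argmin(axis=1)
    return ref_sketch_feats[nearest]
```

The composed features then stand in for ground-truth sketch features when supervising the photo-to-sketch generator.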
SCNet: Learning Semantic Correspondence
This paper addresses the problem of establishing semantic correspondences
between images depicting different instances of the same object or scene
category. Previous approaches focus on either combining a spatial regularizer
with hand-crafted features, or learning a correspondence model for appearance
only. We propose instead a convolutional neural network architecture, called
SCNet, for learning a geometrically plausible model for semantic
correspondence. SCNet uses region proposals as matching primitives, and
explicitly incorporates geometric consistency in its loss function. It is
trained on image pairs obtained from the PASCAL VOC 2007 keypoint dataset, and
a comparative evaluation on several standard benchmarks demonstrates that the
proposed approach substantially outperforms both recent deep learning
architectures and previous methods based on hand-crafted features.
Comment: ICCV 2017
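The combination of appearance matching with geometric consistency can be illustrated by a toy score for a candidate region match: cosine appearance similarity plus a bonus when the match's displacement agrees with the displacements of neighbouring matches. The exponential kernel and the weight `gamma` are illustrative assumptions, not SCNet's actual loss:

```python
import numpy as np

def match_score(feat_a, feat_b, center_a, center_b, neighbor_offsets, gamma=0.5):
    """Toy match score: cosine appearance similarity plus a geometric
    bonus when this match's displacement agrees with nearby matches."""
    app = feat_a @ feat_b / (np.linalg.norm(feat_a) * np.linalg.norm(feat_b) + 1e-8)
    disp = center_b - center_a
    # Reward displacements consistent with the offsets of neighbouring matches.
    geom = np.exp(-np.linalg.norm(neighbor_offsets - disp, axis=1)).mean()
    return app + gamma * geom
```

Scoring matches jointly on appearance and geometry is what lets such a model reject visually similar but geometrically implausible correspondences.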